Skip to content

optimize(api): add warning logs for load-based request rejection#2972

Merged
imbajin merged 6 commits intoapache:masterfrom
contrueCT:fix/issue-2747-load-detect-logs
Apr 25, 2026
Merged

optimize(api): add warning logs for load-based request rejection#2972
imbajin merged 6 commits intoapache:masterfrom
contrueCT:fix/issue-2747-load-detect-logs

Conversation

@contrueCT
Copy link
Copy Markdown
Contributor

Purpose of the PR

Main Changes

Log busy-worker and low-memory rejections in LoadDetectFilter before returning ServiceUnavailableException so operators can diagnose 503 responses from server-side logs.

Add unit coverage for whitelist bypass, busy rejection,
low-memory rejection, and healthy request pass-through, and register the new test in UnitTestSuite.

Verifying these changes

  • Trivial rework / code cleanup without any test coverage. (No Need)
  • Already covered by existing tests, such as (please modify tests here).
  • Need tests and can be verified as follows:
    • xxx

Does this PR potentially affect the following parts?

Documentation Status

  • Doc - TODO
  • Doc - Done
  • Doc - No Need

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds operator-visible warning logs when LoadDetectFilter rejects requests due to high worker load or low free memory (503s), and introduces unit tests to cover the new rejection behavior plus whitelist/healthy pass-through.

Changes:

  • Add WARN logging for worker-load and low-memory request rejections in LoadDetectFilter.
  • Add LoadDetectFilterTest unit coverage for whitelist bypass, busy rejection, low-memory rejection, and healthy requests.
  • Register the new unit test in UnitTestSuite.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
hugegraph-server/hugegraph-api/src/main/java/org/apache/hugegraph/api/filter/LoadDetectFilter.java Adds rejection WARN logs and minor refactor (captures current load).
hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/api/filter/LoadDetectFilterTest.java New unit tests for whitelist/busy/low-memory/healthy scenarios.
hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/UnitTestSuite.java Adds LoadDetectFilterTest to the unit suite.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Contributor

@VGalaxies VGalaxies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update. I left two small review comments; CI looks green on my side, but I think these are still worth addressing.

@contrueCT
Copy link
Copy Markdown
Contributor Author

Thanks for the update. I left two small review comments; CI looks green on my side, but I think these are still worth addressing.

⚠️ The busy-load branch is rate-limited, but this low-memory WARN will still fire on every rejected request. Under sustained memory pressure that can flood the logs and add more I/O pressure to an already stressed server. Please apply the same REJECT_LOG_RATE_LIMITER here, or route both rejection logs through a shared sampling helper.
⚠️ This test drives the real System.gc() path by forcing the low-memory branch. That makes the suite slower and can make it behave differently on constrained CI nodes or JVMs with different GC ergonomics. Consider adding a test seam for the GC step so the test can stub it and only verify the rejection behavior.

Changes made:

  • Applied the same reject-log rate limiting to the low-memory branch, and routed both rejection paths through a shared warning helper so the sampling behavior is consistent.
  • Added a small test seam for gcIfNeeded() / reject-log sampling so the unit test no longer depends on a real System.gc() call.
  • Expanded LoadDetectFilterTest to verify:
    • busy rejection emits a warning log
    • low-memory rejection emits a warning log
    • whitelist / healthy paths do not emit rejection logs
    • reject-log sampling suppresses repeated warning logs as expected

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@dosubot dosubot Bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Apr 2, 2026
@contrueCT contrueCT requested a review from imbajin April 2, 2026 13:22
LOG.warn("Rejected request due to low free memory, method={}, path={}, " +
"presumableFreeMemMB={}, recheckedFreeMemMB={}, gcTriggered={}, " +
"minFreeMemoryMB={}",
context.getMethod(), context.getUriInfo().getPath(),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

‼️ 内存拒绝路径中的逻辑问题:GC 后未重新检查内存

当前逻辑在 GC 后直接抛出异常,并没有重新检查内存是否已恢复到阈值以上。既然已经在这里做了 recheckedFreeMem 计算,建议更进一步:GC 后重新检查内存,如果已恢复则放行请求,减少不必要的 503。

当前行为:检测到内存不足 → 尝试 GC → 记录 GC 后的内存 → 仍然抛异常
建议行为:检测到内存不足 → 尝试 GC → 重新检查 → 如果恢复则放行

Suggested change
context.getMethod(), context.getUriInfo().getPath(),
boolean gcTriggered = this.gcIfNeeded();
if (gcTriggered) {
long allocatedMemAfterGc = Runtime.getRuntime().totalMemory() -
Runtime.getRuntime().freeMemory();
long freeMemAfterGc = (Runtime.getRuntime().maxMemory() -
allocatedMemAfterGc) / Bytes.MB;
if (freeMemAfterGc >= minFreeMemory) {
this.logRejectWarning(
"Low memory recovered after GC, method={}, path={}, " +
"beforeFreeMB={}, afterFreeMB={}",
context.getMethod(), context.getUriInfo().getPath(),
presumableFreeMem, freeMemAfterGc);
return;
}
}
this.logRejectWarning(
"Rejected request due to low free memory, method={}, path={}, " +
"presumableFreeMemMB={}, gcTriggered={}, minFreeMemoryMB={}",
context.getMethod(), context.getUriInfo().getPath(),
presumableFreeMem, gcTriggered, minFreeMemory);

List<PathSegment> segments = context.getUriInfo().getPathSegments();
E.checkArgument(!segments.isEmpty(), "Invalid request uri '%s'",
context.getUriInfo().getPath());
String rootPath = segments.get(0).getPath();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ 方法可见性变更:gcIfNeededprivate static 改为 protected 实例方法

原方法是 private static void gcIfNeeded(),现改为 protected boolean gcIfNeeded()。变更影响:

  1. 扩大可见性(private → protected),破坏了封装性
  2. 从静态方法变为实例方法
  3. 主要目的似乎是为了测试时 override

建议保持 private 可见性,如果需要测试可测试性,可以:

  • 使用 package-private 可见性 + @VisibleForTesting 注解
  • 或者通过注入策略接口实现

同样的建议也适用于 allowRejectLog() 方法。

@@ -54,11 +58,40 @@ public class LoadDetectFilter implements ContainerRequestFilter {
private static final RateLimiter GC_RATE_LIMITER =
RateLimiter.create(1.0 / 30);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ 两条拒绝路径共享同一个 REJECT_LOG_RATE_LIMITER

高负载拒绝和低内存拒绝共享同一个 RateLimiter(每秒 1 个 permit)。这意味着:

  • 如果高负载拒绝消耗了 permit,紧接着的低内存拒绝就不会被记录(反之亦然)
  • 运维人员在同时出现高负载和低内存时,可能只看到一种告警,遗漏另一种

建议为两种拒绝原因使用独立的 RateLimiter:

private static final RateLimiter BUSY_LOG_LIMITER = RateLimiter.create(1.0);
private static final RateLimiter MEMORY_LOG_LIMITER = RateLimiter.create(1.0);

return REJECT_LOG_RATE_LIMITER.tryAcquire();
}

protected void logRejectWarning(String message, Object... args) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ logRejectWarning() 与直接调用 allowRejectLog() 的不一致

高负载路径使用封装的 logRejectWarning(),但低内存路径直接调用 allowRejectLog() + LOG.warn()

// 高负载路径
this.logRejectWarning("Rejected request due to high worker load...");

// 低内存路径
boolean shouldLog = this.allowRejectLog();
if (shouldLog) { LOG.warn(...); }

两条路径的日志方式不一致,降低了可读性。建议统一为同一种模式。另外 logRejectWarning() 在生产代码中只被调用了一次,是否真的需要抽取为独立方法值得考虑。

boolean shouldLog = this.allowRejectLog();
boolean gcTriggered = this.gcIfNeeded();
if (shouldLog) {
long allocatedMemAfterCheck = Runtime.getRuntime().totalMemory() -
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 recheckedFreeMemgcTriggered=false 时是冗余信息

gcTriggered=false 时,recheckedFreeMempresumableFreeMem 几乎完全相同(两次采样之间几乎没有时间差),日志中两个近似相同的值会造成困惑。

建议只在 gcTriggered=true 时才计算并记录 recheckedFreeMem,否则省略此字段。

Copy link
Copy Markdown
Member

@imbajin imbajin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good enhancement

@dosubot dosubot Bot added the lgtm This PR has been approved by a maintainer label Apr 25, 2026
@imbajin imbajin merged commit 68dd29b into apache:master Apr 25, 2026
13 of 14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lgtm This PR has been approved by a maintainer size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Improvement] Add critical log statements.

4 participants